Expand CRF to Model Long Distance Dependencies in Prosodic Break Prediction
Abstract
The distribution of intonation phrase lengths is important information for prosodic break prediction, but existing CRF frameworks cannot make full use of it. This paper proposes an expanded CRF to address the problem. Its lattice carries the location of the previous intonation phrase (L3) break, which makes it possible to support various dynamic features, such as the number of syllables since the previous L3 break and the POS of the word after the previous L3 break. The expanded CRF yields remarkable improvements on the L3 break prediction task. The approach is also promising for other tasks that involve long-distance dependencies.
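The idea of a lattice that carries the location of the previous L3 break can be illustrated with a small decoding sketch. This is not the paper's implementation: it covers only Viterbi-style decoding (no training, no label-transition features), and the names `decode_breaks` and `toy_score`, the label convention (1 = L3 break after the word, 0 = no break), and the scoring rule are all assumptions made for illustration. The point it shows is that once each lattice state records where the last break was, a dynamic feature such as "syllables since the previous L3 break" becomes available to the local score.

```python
from typing import Callable, Dict, List, Tuple


def decode_breaks(
    syllables: List[int],
    score: Callable[[int, int, int], float],
) -> List[int]:
    """Best-path search over an expanded lattice (illustrative sketch).

    syllables[i] is the syllable count of word i.
    The lattice state is the index of the word after which the most recent
    L3 break occurred (-1 means the sentence start).
    score(i, label, dist) is a hypothetical local score, where dist is the
    number of syllables from the previous break up to and including word i.
    """
    # prefix[k] = total syllables in words 0..k-1
    prefix = [0]
    for count in syllables:
        prefix.append(prefix[-1] + count)

    # best[last] = (score, labels) of the best path whose last break is `last`
    best: Dict[int, Tuple[float, List[int]]] = {-1: (0.0, [])}
    for i in range(len(syllables)):
        new_best: Dict[int, Tuple[float, List[int]]] = {}
        for last, (total, labels) in best.items():
            # Dynamic feature: syllables since the previous L3 break
            dist = prefix[i + 1] - prefix[last + 1]
            for label in (0, 1):
                cand = total + score(i, label, dist)
                # A break moves the state to i; otherwise the state is kept
                state = i if label == 1 else last
                if state not in new_best or cand > new_best[state][0]:
                    new_best[state] = (cand, labels + [label])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


def toy_score(i: int, label: int, dist: int) -> float:
    """Hypothetical score: reward breaks once a phrase has 4+ syllables."""
    if label == 1:
        return 1.0 if dist >= 4 else -1.0
    return 0.5 if dist < 4 else -1.0


print(decode_breaks([2, 2, 2, 2], toy_score))  # prints [0, 1, 0, 1]
```

Because the state is the position of the last break, the number of states grows with sentence length, so this sketch costs roughly quadratic time in the number of words. That is the price of exposing the long-distance feature to an exact search.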
Related papers
Automatic Prosodic Labeling with Conditional Random Fields and Rich Acoustic Features
Many acoustic approaches to prosodic labeling in English have employed only local classifiers, although text-based classification has employed some sequential models. In this paper we employ linear chain and factorial conditional random fields (CRFs) in conjunction with rich, contextually-based prosodic features, to exploit sequential dependencies and to facilitate integration with lexical feat...
Prosodic Words Prediction from Lexicon Words with CRF and TBL Joint Method
Predicting prosodic word boundaries directly influences the naturalness of synthetic speech, because the prosodic word is at the lowest level of the prosody hierarchy. In this paper, a Chinese prosodic phrasing method based on CRF and TBL models is proposed. First, a CRF model is trained to predict the prosodic word boundaries from lexicon words. After that we apply a TBL based error driven learnin...
Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework
We model Mandarin phrase break prediction as a classification problem with three level prosodic structures and apply conditional maximum entropy classification to this problem. We acquire multiple levels of linguistic knowledge from an annotated corpus to become well-integrated features for maximum entropy framework. Five kinds of features were used to represent various linguistic constraints i...
An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition
This paper shows that a simple two-stage approach to handle non-local dependencies in Named Entity Recognition (NER) can outperform existing approaches that handle non-local dependencies, while being much more computationally efficient. NER systems typically use sequence models for tractable inference, but this makes them unable to capture the long distance structure present in text. We use a C...